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Abstract 

The Balanced Domain Decomposition (BDD) method and the Finite Element Tear- 
ing and Interconnecting (FETI) method are two commonly used non-overlapping do- 
main decomposition methods. Due to strong theoretical and numerical similarities, 
these two methods are generally considered as being equivalently efficient. However, 
for some particular cases, such as for structures with strong heterogeneities, FETI 
requires a large number of iterations to compute the solution compared to BDD. In 
this paper, the origin of the bad efficiency of FETI in these particular cases is traced 
back to poor initial estimates of the interface stresses. To improve the estimation 
of interface forces a novel strategy for splitting interface forces between neighbor- 
ing substructures is proposed. The additional computational cost incurred is not 
significant. This yields a new initialization for the FETI method and restores nu- 
merical efficiency which makes FETI comparable to BDD even for problems where 
FETI was performing poorly. Various simple test problems are presented to discuss 
the efficiency of the proposed strategy and to illustrate the so-obtained numerical 
equivalence between the BDD and FETI solvers. 
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1 Introduction 



Domain Decomposition methods provide a natural framework to solve en- 
gineering problems decomposed into subparts. Such problems can arise for 
instance because each subdomain is discretized independently, or because the 
subdomains represent different physical domains. Decomposed domains can 
also be created from an initial single domain problem in order to make efficient 
use of parallel computing hardware, i.e. in order to distribute the computing 
work on several processors. 

Among the domain decomposition methods applied in engineering mechan- 
ics to solve elliptic linear problems, two similar methods have emerged in 
the last decade as efficient parallel computing methods: the primal and the 
dual Schur complement methods. More specifically, two procedures have been 
shown to ensure scalability and robustness: the Balanced Domain Decomposi- 
tion (BDD) method [1,2] and the Finite Element Tearing and Interconnecting 
(FETI) method [3,4]. The BDD method is a primal procedure where precondi- 
tioned conjugate gradient iterations are applied to find the interface displace- 
ments that satisfy the interface equilibrium. In the FETI approach, it is the 
interface forces that are searched for iteratively so as to satisfy the interface 
displacement compatibility. Therefore FETI is sometimes referred to as a dual 
method. 

The primal and dual Schur complement methods are based on very similar 
concepts (see e.g. [5] for a mechanical description). Mathematically, it has 
been shown that the preconditioned interface operators for both the BDD 
and FETI methods have a condition number bounded by [6,7] 



where H and h represent the subdomain and the mesh size respectively. Hence 
it is often accepted in the Domain Decomposition community that using one 
method instead of the other is a matter of taste and implementation pref- 
erences. However, when looking closely at the details and variants of both 
methods, it becomes rapidly clear that showing the exact equivalence between 
the primal BDD and the dual FETI is not easy, if at all possible [8]. 

Let us then consider the simple example depicted in Figure 1 of a highly het- 
erogeneous elastic cube (J^ = 10 5 ) subdivided into 3x3x3 subdomains. 
A uniform pressure is applied on the face opposite to the clamped side. The 
structure is discretized using Q 2 hexahedral finite elements with 27 nodes. 
The model contains 21000 degrees of freedom, of which 6000 belong to subdo- 
main interfaces. In Figure 2 the convergence curves of the global equilibrium 
residual corresponding to iterations of the BDD and the FETI methods are 
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plotted. Both methods are equipped with what literature refers to as the best 
preconditioners and coarse grids (see section 2). Although the asymptotic con- 
vergence of both procedures is similar, it is observed that the convergence of 
the FETI method is less monotonic and that its initial residual is significantly 
higher. 
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Fig. 1. Decomposed het- 
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Fig. 2. Convergence of BDD & FETI 



These results seem to indicate that some specific details such as the choice 
of the initial estimates result in possibly significant differences between the 
primal and dual algorithms. 

In this paper we revisit the way the interface forces are estimated initially in 
the FETI method in an attempt to obtain a convergence comparable to that 
of the BDD. 

In the next section, we shortly recall the concepts underlying the FETI solver. 
In section 3 we explain that the forces applied on the interface can be split 
in different ways. Although the final result is independent of that splitting, it 
affects the actual FETI iteration history. We then present an efficient way to 
define such a splitting and show how it is related to the construction of the 
initial iterate of FETI in section 4. Numerical examples are reported in section 
5 to illustrate the effectiveness of the new initialization strategy. Finally, we 
present some conclusions. 



2 FETI basics 



2.1 The decomposed problem 



Let us consider a domain Vt subdivided into N s non-overlapping subdomains 
ftW and assume that we are solving a linear (or linearized) static equilibrium 
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problem on the domain. The discretized subdomain equilibrium is expressed 
by 



K (S) U (S) = f(s) + g 



(s) 



h...N„ 



(2) 



where K^ s \ and are the subdomain stiffness matrices, displacements 
and applied forces respectively, are the connecting forces on the interface 
between subdomains (thus zero on the internal degrees of freedom). For the 
sake of simplicity, we assume in the following that the meshes are matching 
(conforming) on the interface. 

The interface forces satisfy an interface equilibrium equation expressing that 
when assembled on the interface, the resultant is null (action-reaction): 



(3) 



s=l 



where is a Boolean assembly matrix. The interface connecting forces are 
such that the interface degrees of freedom are compatible, namely 



N s 



£B«u« = o 



(4) 



s=l 



This relation expresses that for any pair (u^ s \ u^) of degrees of freedom 
matching and the interface, — = 0. are thus signed Boolean 

matrices expressing the compatibility constraints on the interface. 

The equilibrium problem of domain Q is fully described by the local equilib- 
rium (2) and by the interface constraints (3, 4). In block diagonal notations, 
it can be summarized as 

( Ku= f+g 
L T g = 
Bu = 



(5) 



where K is the block diagonal matrix of the local operators and where 
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Note that in this description, one set of interface displacements and one set 
of interface forces are defined per subdomain. 



2.2 Solvers for decomposed problems 



Solving (5) can be done in several ways: 

• Considering (5) as a constrained equilibrium problem in terms of u and g 
leads to the three-field formulation of decomposed domains (see e.g. [9,10]). 

• One can choose to work with a displacement set u that satisfies a priori the 
interface compatibility (4). For that purpose we define a global set u g of 
degrees of freedom unique on the interface such that 

u (s) = L (s) u g or u = Lu g (7) 

where is the same assembly Boolean matrix as in (3) that extracts 
subdomain degrees of freedom from the global set. Stating that are 
obtained from a unique set is obviously equivalent to stating the interface 
compatibility (4) and (7) thus implies 

Bu = BLug = (8) 

for any global displacement u g . On the other hand, all compatible displace- 
ments can be written as in (7). Hence 

L = null(B) (9) 

In order to illustrate these concepts, we reproduce the example of [11] in 
Figure 3. Introducing (7) in (5) yields 

KLu a — f + q 

< TT n (10) 
L T g = 

V 

This set of equations is at the basis of the primal iterative solution tech- 
niques such as the Primal Schur Complement or the BDD methods [1, 12]: 
iteration schemes are applied to find the displacements u g until the interface 
equilibrium L T g = L T (KLu g — f) — is satisfied. 

• One can choose in (5) a set of interface forces satisfying a priori the interface 
equilibrium L T g = while keeping redundant interface degrees of freedom 
in u. According to (9), such interface forces have the generic expression 

0« = _B« T A or g = -B T X (11) 

A are interface forces that act in opposite directions between any pair of 
matching degrees of freedom on the interface and are therefore in equilibrium 
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Fig. 3. Lagrange multipliers and interface compatibility 
(see Figure 3). Problem (5) becomes 

Ku + B T \ = f 

(12) 

Bu = 

Clearly, A are the Lagrange multipliers associated to the interface compat- 
ibility constraints. This form of the decomposed problem is the basis for 
the dual procedures such as FETI: iterative algorithms are applied to com- 
pute the interface forces A such that the displacements resulting from the 
subdomain equilibrium are compatible on the interface. 

• If one chooses interface displacements that are unique on part of the in- 
terface while, on the remainder of the interface, equilibrated connecting 
forces are defined, one obtains hybrid primal/dual approaches such as the 
FETI-DP procedure [13]. 

• If on the entire interface we use displacements and forces that satisfy a linear 
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combination of the interface equilibrium and compatibility (e.g. Robin type 
of boundary conditions), one obtains formulations typically used in wave 
propagation analysis such as described for instance by Helmholtz equations 
[14]. 

• Finally if both the interface equilibrium and compatibility are enforced a 
priori, one obtains the fully assembled form 

L T KLu g = L T f (13) 



2.3 FETI: the dual iterative solver 



In the FETI method [4], the decomposed problem (12) is expressed in terms 
of interface forces A: using the subdomain equilibrium equations to eliminate 

„« = ( /« - B« t a) - # (s) a (s) (14) 

where is the inverse of or a generalized inverse if subdomain 

is floating when disconnected from its neighbors. In the latter case, are 
the associated rigid body modes, their amplitudes being determined such 
that the interface forces are in equilibrium with the applied forces f( s \ i.e. 
such that the subdomain equilibrium is well-posed: 

R (*) T ff(s) _ B (s) T x) = (15) 



Substituting (14) into the interface compatibility condition, and taking ac- 
count of (15), one obtains the dual interface problem 
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where 
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F I = Y / B (s) K (s) B {s) =BK + B T 

s=l 

d=J2 B {s) K {s)+ f {s) = BKf 



s=l 
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Let use define the projection operator P such that GjP = 0: 

P(Q) = I-QG T (GjQGjy 1 Gj (17) 
The choice of the operator Q is discussed later. Introducing the splitting 



A = P(Q)A + A 
A = QG 7 (GjQG^e 



(18) 
(19) 



the dual interface problem (16) is equivalent to 
P(Q) T F / P(Q)A = P(Q) T (d - Fj\ ) 



(20) 



The FETI method consists in preconditioned conjugate gradient iterations on 
the dual interface problem (20) [4]. Applying Fj to an iterate is a naturally 
parallel operation. The projection steps however require solving a coarse global 
problem. The preconditioning operators and the possible choices for Q are 
summarized next. 



2.4 FETI preconditioners and coarse grid space 



The forces B T X exist only on the interface degrees of freedom. Using a sub- 
script b and j for interface boundary and internal degrees of freedom respec- 
tively, one can re- write the decomposed problem (12) by condensing out uf. 



Su b + BIX = ft 
B b u b = 



(21) 
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where B b is a submatrix of B associated to boundary displacements only and 



^bb ~~ ^bi ^ii 



-IV ^ 



(22) 



ft 



f, 



(s) 



ii 



(23) 



S is the block diagonal matrix of the local operators statically condensed on 
the interface (also known as Schur complements). The dual interface problem 
(16) or (20) can thus be written in equivalent forms by replacing K, f and 
B by S, f b * and B b respectively. 

The preconditioner in FETI for which the optimal conditioning number (1) 
holds is the Dirichlet preconditioner: 



~ T 

B b SB b 



(24) 



The operator B b is similar to B b but includes a scaling with respect to inter- 
face multiplicity or relative interface stiffness [15]. The scaling is implemented 
as a simple pre- and post-processing in the preconditioning step. Mathemati- 
cally, its general expression is [7, 11] 



B b = (B b ABl) + B b A 



(25) 



where the + superscript denotes an inverse or a pseudo-inverse in case redun- 
dant compatibility constraints are present. It can be shown that the multi- 
plicity scaling procedure corresponds to A — I, whereas the stiffness scaling 
(also known as super-lumped scaling) corresponds to A = diag(.K W) )~ 1 . 

The operator Q in the coarse grid projector (17) defines the interface forces 
QGi associated to rigid mode displacements of subdomains. It can be chosen 
as Q = I, but for many engineering problems it should be taken as the 
preconditioner, namely 

Q = Ff D l (26) 
or at least as a degenerated form of the preconditioner such as 



Q = (B b diag(K 66 )- 1 B b T ) 



+ 



(27) 



Further discussion on preconditioning, scaling and on the choice of Q can be 
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found in [4,11,15,16]. 



3 Splitting forces on an interface 

The definition of the decomposed hybrid problem (12) is not unique: any 
interface force satisfying the interface equilibrium can be added to the system 
without changing the final solution u. Indeed, replacing (12) by 

J J " (28) 

Bu= 

V 

for any /x will yield the same solution u, the Lagrange multiplier being then 
such that A = X+fJ,. This can also be observed from the fact that (12) and (28) 
have the same assembled forces since L T f = L T f. Mechanically speaking, it 
means that adding forces on one side of the interface boundary and subtracting 
it on the other side yields the same solution in terms of displacement although 
it modifies the internal forces on the interface. 



3.1 Force splitting commonly used 

From the discussion above, it is clear that one can choose the splitting of 
interface forces on the interface boundaries 1 . Commonly, interface forces are 
split between the connecting subdomains proportionally to subdomain relative 
stiffness, in a way consistent with the scaling in the preconditioner. Mathe- 
matically speaking, this is equivalent to choosing 

/ = diag(K)L (L T diag(K)L) _1 (L T f) (29) 

where (-^ T /) are the applied forces given in an initially assembled problem or 
the assembled forces corresponding to a different force splitting. (L T &mg(K)L) 
is the assembled diagonal stiffness on the interface. 

Let us note that for any symmetric positive definite matrix A, the following 



1 In the same manner, it was noted in [11] that coefficients of additional constraints 
can also be arbitrarily split on the interface, which led to the construction of efficient 
preconditioners for problems with multipoint constraints. 
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relation holds: 2 

AL (i/ALy 1 L T + B T (BA- 1 B T ) + BA 1 = I (30) 
From (30), one deduces that the force splitting (29) is equivalent to 

f = f-B T (Bdiag(K)- 1 B T ) + J Bdiag(K)- 1 / (31) 

This last relation shows that the common splitting technique (29) corresponds 
to adding a particular set of equilibrated interface forces as described by (28). 



3.2 Splitting statically condensed forces 

In section 2.4, we indicated that the hybrid decomposed problem can be set 
in an equivalent form condensed on the interface (21). Following the same 
splitting procedure for the force as in (29), one would then construct the 
problem 

+ = * (32) 

B b u b = 

V 

where 

ft = dmg(K bb )L b (L T b dmgiK^L.y 1 (Lift) (33) 

and where (L b ft) are the applied forces statically condensed and assembled 
on the interface. Following a similar discussion as in the previous section, it is 
straightforward to show that such a splitting is equivalent to 

ft = ft - Bl (B b dmg(K bb )- l Bl) + B b diag^)- 1 /* (34) 

indicating again that the splitting corresponds to adding a particular set of 
equilibrated interface forces. 

Let us observe that, in the non-condensed format, (32) is identical to 



2 This relation expresses the complementarity of the primal and dual scaling and 
was already suggested in [8]. Its proof can be obtained by noting that L T (?>{)) and 
SA _1 (30) are trivially satisfied. Owing to the fact that A is symmetric positive 
definite and because the image of L corresponds to the nullspace of £?, the final 
result is obtained. 
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Ku + B X 



f-B T ( J Bdiag(K)- 1 J B T ) + Bdiag(K)- 1 



(35) 





ft 



When comparing it with (31), it is clear that splitting the condensed forces is 
not equivalent to splitting / as commonly done. Although the displacement 
field obtained is obviously the same, the Lagrange multipliers searched for 
during the FETI iterations will be different. The question thus arises: what 
splitting should be used and how does it affect the FETI iterations on the dual 
interface problem? Next section provides a theoretical analysis of the interest 
of the splitting while the section after provides related numerical assessments. 



4 An improved initial estimate of the Lagrange multipliers 



Let us consider again the hybrid decomposed problem (12). In order to con- 
struct an estimate for the interface Lagrange multipliers, let us assume that 
the internal degrees of freedom Ui satisfy the local equilibrium while u b on the 
interface boundary have a zero estimate. This is exactly the assumption un- 
derlying the initialization of the primal Schur complement iteration schemes. 
The sub domain equilibrium then writes 

B T b \ * f* b (36) 

which cannot be exactly satisfied (unless u b = corresponds to the solution). 
Hence, let us decompose the statically condensed force f b * into a component 
satisfying the interface equilibrium (i.e. belonging to the image of B%) and a 
remainder: 

Bl\ ~ f* = B%j + remainder (37) 

where 

1 =(B b DBl) + B b Df* b (38) 

for any non-singular and symmetric matrix D. The remainder is then orthog- 
onal to DB]j . If we choose as initial estimate 

Aoo = 7 (39) 

then the initial equilibrium residual -B^Aoo — f b = remainder is minimum in 
the iD-norm. Setting D to be diag(-K^ ) )~ 1 

Aoo = (B b dmg{K bb )- l Bl) + B b &mg(K hb )- l r b (40) 
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can be easily computed by a scaling of the interface condensed forces. Observ- 
ing that diag(-K"f,t) _1 is an approximation of S + , Ao minimizes a norm having 
the meaning of an energy. 

We thus propose to start the FETI iterations with an initial estimate obtained 
by rendering A o admissible, i.e. such that Gj\ = e. Using the notations 
introduced in (17), 

A = P(Q) A 00 + QG T (G%QGj) ' e (41) 

To be consistent with the choice D = diag(K" fe6 ) _1 , one should take Q as in 
(26) or (27). 



Remarks 

• If D = S + in (38), Aoo = Fj + d. Furthermore, one would have Q = F[ + 
and 

A = P(Q)F I + d + F I + G I (GjF I + G i y 1 e 

= Fr + (d-Gra) (42) 
with a = (GjF 7 + G 7 ) _1 (GjF^d - e) 

Hence the new choice of initial estimate would yield the exact solution. 

• If we choose to split the interface forces such as described in (33) (or its 
equivalent form 34), (40) yields A o = so that the initial iterate (41) would 
correspond to the standard FETI starting procedure. Hence it is equivalent 
to apply standard FETI iterations when splitting condensed forces on the 
interface as in (33) or to use any decomposed force vector f together with 
the starting procedure (4-1)- 

• The construction of A o in (40) has exactly the same mechanical interpre- 
tation as the scaling of interface forces in the preconditioner (see [11]). The 
cost incurred by (33) or (41), corresponds to a preconditioning step, thus less 
than one half of a FETI iteration. Obviously, if the lumped preconditioner 
is applied, the standard splitting (29) should be considered. 



5 Numerical assessment 



In order to assess the performance of the new estimate for the Lagrange multi- 
pliers, we consider the solution to various problems by FETI and BDD. In all 
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cases, convergence is monitored through the evaluation of the global residual: 



\ K 9 u g- fg 



< e (43) 



where K g , u g and f g are global assembled stiffness matrix, displacement field 
and forces (see (13)). All results are obtained with the Dirichlet preconditioner 
equipped with the super-lumped scaling, i.e. A = diag(-K 6 5)~ 1 . The stopping 
criterion e is set to 10~ 6 . The FETI method is tested for its different projectors: 
we note P(Ff^) the Dirichlet projector, P(W) the superlumped projector 
corresponding to (27) and P(I) the identity projector. 



5.0.0.1 Cube with checkerboard heterogeneity: Let us first assess 
the strategies described above on the problem described in the introduction 
(see fig. 1). Figure 4 presents the convergence of the primal residual through 
the conjugate gradient iterations for the BDD and for the FETI approaches 
with standard force splitting. The primal approach clearly exhibit better re- 
sults. Also, due to the strong heterogeneity in the structure and along the 
interfaces, the projector P(I) yields very poor convergence and is thus not 
suitable. It is observed that, although the Dirichlet projector yields a better 
convergence rate, the superlumped projector P(W) has a significantly lower 
initial residual and hence reaches convergence faster. The initialization asso- 
ciated to P(W) leads to a starting residual 10 3 times smaller, however its 
behaviour during the solution process is less regular. 
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Fig. 4. Cube with checkerboard heterogeneity (fig. 1): convergence of the primal 
residual for FETI with standard force splitting 



Figure 5 presents the convergence history for the BDD method and for the 
FETI method equipped with the Dirichlet projector and with the new initial- 
ization. For comparison, the convergence obtained with the standard splitting 
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is shown once more. Note that the new initialization not only provides a better 
initial estimate (lower initial residual) but also leads to similar convergence 
rates as for the standard splitting. Therefore the convergence history of FETI 
becomes very similar to the convergence of the primal BDD approach. The 
total number of iterations is about the same as for the primal BDD, although 
the convergence of FETI is less monotonic. 
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Fig. 5. Cube with checkerboard heterogeneity (fig. 1): convergence of the primal 
residual for FETI with new initialization 



Approach 


Number of 
iterations 


Initial Residual 
(Log) 


BDD 


19 


0.659 


FETI 
Dirichlet 


No Splitting 


28 


4.428 


Classical Splitting 


28 


4.377 


New Initialization 


18 


0.359 


FETI 
Dirichlet 
P(W) 


No Splitting 


21 


0.873 


Classical Splitting 


21 


0.872 


New Initialization 


20 


0.868 


FETI 
Dirichlet 
P(I) 


No Splitting 


74 


5.029 


Classical Splitting 


74 


5.026 


New Initialization 


73 


5.016 



Table 1 

Performance results for the cube of fig. 1 

Table 1 summarizes the performance results of the various available strate- 
gies. To investigate the efficiency of the new initialization (or force splitting) 



15 



procedure, we compare the number of iterations to achieve convergence and 
the norm of the initial primal residual. The new initialization yields signifi- 
cant improvements for the Dirichlet projector (35% less iterations), but only 
slightly affects the convergence when other projectors are used. 



5.0.0.2 Cube and slanted cube with heterogeneous layers: We now 

assess the new initialization strategy on two other configurations of the cube 
in order to evaluate the influence of the geometry and of the repartition of 
heterogeneities. The structures depicted in Figure 6 and 7 are similar to fig. 
1 but with different material distribution. Also, for the problem described in 
Figure 7, the cube has been slanted by 60 degrees. In table 2 we report the 
number of iterations when using the BDD solver and when applying FETI 
with the standard and the new initialization. 
















El 


E2 







Fig. 6. Cube with heterogeneous Fig. 7. Slanted cube with hetero- 
layers geneous layers 

For the straight cube of fig. 6, the use of the novel initialization leads to only 
small improvements. Observing that the FETI solver with Dirichlet projector 
and standard splitting converges nearly as fast as the BDD, it is clear that 
the new splitting strategy can not have a significant effect. 

For the slanted cube, due to the geometric distortion of the substructures and 
of the mesh, the BDD convergence is better than when the FETI with the 
standard splitting is applied. When the new initialization strategy is used, the 
convergence history of FETI is again very similar to the convergence of the 
primal BDD. 



5.0.0.3 Nonlinear flexion of a composite beam As a last example, we 
analyze a non-linear problem solved through a sequence of linearized systems. 
The structure (fig. 8) is a slender beam of aspect ratio 9 with square cross- 
section. It is made of longitudinal strips of metal and rubber. The loading 
corresponds to an imposed pressure on one side of the beam. Due to the pres- 
ence of elastomer parts, the structure undergoes large deformations. We chose 
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Approach 


Cube, fig. 6 


Slanted cube, fig. 7 






Num. of it. 


Num. of it. 


BDD 


19 


73 




iNo spurting 


20 


85 


Dirichlet 


Classical Splitting 


21 


85 




incw iniiiaiiZHiioii 


19 


73 


FETI 


No Splitting 


22 


88 


Dirichlet 


Classical Splitting 


22 


88 


P(W) 


New Initialization 


22 


85 


FETI 


No Splitting 


92 


153 


Dirichlet 


Classical Splitting 


92 


154 


P(I) 


New Initialization 


90 


159 



Table 2 

Performance results on problem fig. 6 and 7 

a Kirchoff Saint- Venant model for the metal, the characteristic coefficients 
of which are its Young modulus E = 20000 MPa and Poisson's coefficient 
v = 0.3. A NeoHookian model is assumed for the rubber characterized by a 
shear modulus G = 2.0 MPa and a compressibility modulus K = 2000 MPa. 
The structure is decomposed into 27 monomaterial paralellepipedic subdo- 
mains and each substructure is meshed in 2 x 2 x 18 cubic elements. In order 
to handle the quasi-incompressibility of the elastomer, a mixed finite element 
formulation is considered where the pressure field and displacements are dis- 
cretized independently. We choose a Q 2 — Pi hexaedral element with 27 dis- 
placement nodes and 4 internal pressure nodes. Details on this formulation 
and on the practicalities for applying it properly in simulation can be found 
in [17]. 

The complete model has 55300 degrees of freedom of which 16400 belong to 
the interface. Due to the behaviour of the elastomer, the problem is highly 
nonlinear. Newton-Raphson iterations are performed where the tangent ma- 
trix is updated at every step. The pressure loading is 5 bars. The stopping 
criterion is set to 10~ 3 for the relative primal residual of FETI when solv- 
ing the linearized systems. For the Newton-Raphson iterations, the tolerance 
for the relative residual of the nonlinear equations is set to 10~ 4 , so that 5 
Newton-Raphson iterations and thus 5 linear solves must be performed. 

Figure 9 indicates the number of iterations required when solving the linearized 
systems by the classical FETI method, by FETI with improved initialization 
and by the primal BDD. The Dirichlet and diagonal projectors are applied, 
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and the Dirichlet preconditioner with stiffness scaling is used. The signifi- 
cant increase of the number of iterations between the first and the following 
linearized systems can be explained by the loss of positivity of the tangent 
matrix which is mostly due to incompressibility. Indeed, the conjugate gradi- 
ent algorithm (with full re-orthoganilization) applied on the interface problem 
in the FETI and in the BDD methods remains applicable for non positive 
matrices but its convergence is significantly slowed down [18]. 3 As observed 
from figure 9, the new initialization enables FETI to achieve performance re- 
sults which are very similar to the BDD method. Comparing the total number 
of iterations, the classical FETI approach requires 10% more iterations than 
BDD while the FETI method with new initialization requires slightly less iter- 
ations than the BDD. From figure 9 we also observe that the new initialization 
technique improves the performance of FETI both for the Dirichlet and the 
diagonal projector. For this particular structure with regular geometry, the 
cost effective diagonal projector yields a convergence rate very similar to the 
convergence rate obtained with the more computationally intensive Dirichlet 
projector, except for the very first linearized system solve. 

Our numerical experiments indicate that for a large class of nonlinear problems 
such as the one depicted here, the new initialization leads to a small but 
non-negligible gain in terms of number of iterations and CPU time. Another 
important beneficial effect of the new starting procedure for FETI comes from 
the fact that, since the initial residual of the iterations on the interface problem 
are several orders of magnitude lower with the new initialization, stagnation 
of the residual of the FETI iterations which often happens when dealing with 
higher nonlinearity (higher loading) is significantly delayed, so that in practice 
restarting of the iteration can be avoided. 



3 Often, non-positivity is due to the mutation of former null-modes (rotations) to 
negative modes, it can be handled by the introduction of these modes as constraints 
in Krylov-augmented algorithms [19]. In our case, non-positivity is mostly due to 
the behaviour of the rubber and the strategy described above is non-relevant. An 
efficient strategy based on the approximation of negative modes can be found in [20] . 
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6 Conclusions 

For some particular structural problems such as those exhibiting strong het- 
erogeneities and geometric distortion, the Finite Element Tearing and Inter- 
connecting (FETI) solver can yield poor convergence compared to the concep- 
tually similar Balanced Domain Decomposition (BDD) solver. In those cases, 
the bad performance of FETI can be traced back to high initial residual in the 
iterations on the interface problem. The initial residual is strongly related to 
the way the applied forces are split on the subdomain interface boundaries. 

In this paper, we propose a novel strategy to split the applied forces between 
the subdomains. We propose to split the statically condensed interface force 
according to interface diagonal stiffness. This leads to building a more efficient 
initial estimate for the interface connecting forces. The new initialization for 
the FETI iterations mainly involves computing statically condensed forces on 
the interface and can thus be performed at a computational cost equivalent 
to less than half the cost of a full FETI iteration. 

The numerical examples described in this paper indicate that for the problems 
where the primal BDD method outperforms FETI due to unexpected high 
initial residual, the new initialization strategy builds a better starting estimate 
of the interface forces and, in turn, to an initial residual similar to the BDD 
residual. The FETI method then converges in a manner very similar to the 
BDD method. 

The novel starting strategy never deteriorates the FETI convergence and leads 
to significant improvements in some pathological cases. Therefore we suggest 
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to use the presented initialization as default in FETI solvers. 

With the proposed initialization for FETI, the FETI method and the BDD 
solver lead to similar convergence and computational costs for complex prob- 
lem where the Dirichlet projector is required. For problems where the sim- 
plified FETI preconditioners and projectors can be used without significantly 
deteriorating the convergence of the interface iterations, FETI is often found 
to be more efficient in terms of overall computing cost. 
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Abstract 

The Balanced Domain Decomposition (BDD) method and the Finite Element Tear- 
ing and Interconnecting (FETI) method are two commonly used non-overlapping do- 
main decomposition methods. Due to strong theoretical and numerical similarities, 
these two methods are generally considered as being equivalently efficient. However, 
for some particular cases, such as for structures with strong heterogeneities, FETI 
requires a large number of iterations to compute the solution compared to BDD. In 
this paper, the origin of the bad efficiency of FETI in these particular cases is traced 
back to poor initial estimates of the interface stresses. To improve the estimation 
of interface forces a novel strategy for splitting interface forces between neighbor- 
ing substructures is proposed. The additional computational cost incurred is not 
significant. This yields a new initialization for the FETI method and restores nu- 
merical efficiency which makes FETI comparable to BDD even for problems where 
FETI was performing poorly. Various simple test problems are presented to discuss 
the efficiency of the proposed strategy and to illustrate the so-obtained numerical 
equivalence between the BDD and FETI solvers. 



Key words: domain decomposition, iterative solver, FETI, Schur complement, 
force splitting 
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1 Introduction 



Domain Decomposition methods provide a natural framework to solve en- 
gineering problems decomposed into subparts. Such problems can arise for 
instance because each subdomain is discretized independently, or because the 
subdomains represent different physical domains. Decomposed domains can 
also be created from an initial single domain problem in order to make efficient 
use of parallel computing hardware, i.e. in order to distribute the computing 
work on several processors. 

Among the domain decomposition methods applied in engineering mechan- 
ics to solve elliptic linear problems, two similar methods have emerged in 
the last decade as efficient parallel computing methods: the primal and the 
dual Schur complement methods. More specifically, two procedures have been 
shown to ensure scalability and robustness: the Balanced Domain Decomposi- 
tion (BDD) method [?,?] and the Finite Element Tearing and Interconnecting 
(FETI) method [?,?]. The BDD method is a primal procedure where precondi- 
tioned conjugate gradient iterations are applied to find the interface displace- 
ments that satisfy the interface equilibrium. In the FETI approach, it is the 
interface forces that are searched for iteratively so as to satisfy the interface 
displacement compatibility. Therefore FETI is sometimes referred to as a dual 
method. 

The primal and dual Schur complement methods are based on very similar 
concepts (see e.g. [?] for a mechanical description). Mathematically, it has 
been shown that the preconditioned interface operators for both the BDD 
and FETI methods have a condition number bounded by [?,?] 



where H and h represent the subdomain and the mesh size respectively. Hence 
it is often accepted in the Domain Decomposition community that using one 
method instead of the other is a matter of taste and implementation pref- 
erences. However, when looking closely at the details and variants of both 
methods, it becomes rapidly clear that showing the exact equivalence between 
the primal BDD and the dual FETI is not easy, if at all possible [?]. 

Let us then consider the simple example depicted in Figure 1 of a highly het- 
erogeneous elastic cube (J^ = 10 5 ) subdivided into 3x3x3 subdomains. 
A uniform pressure is applied on the face opposite to the clamped side. The 
structure is discretized using Q 2 hexahedral finite elements with 27 nodes. 
The model contains 21000 degrees of freedom, of which 6000 belong to subdo- 
main interfaces. In Figure 2 the convergence curves of the global equilibrium 
residual corresponding to iterations of the BDD and the FETI methods are 
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plotted. Both methods are equipped with what literature refers to as the best 
preconditioners and coarse grids (see section 2). Although the asymptotic con- 
vergence of both procedures is similar, it is observed that the convergence of 
the FETI method is less monotonic and that its initial residual is significantly 
higher. 
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Fig. 2. Convergence of BDD & FETI 



These results seem to indicate that some specific details such as the choice 
of the initial estimates result in possibly significant differences between the 
primal and dual algorithms. 

In this paper we revisit the way the interface forces are estimated initially in 
the FETI method in an attempt to obtain a convergence comparable to that 
of the BDD. 

In the next section, we shortly recall the concepts underlying the FETI solver. 
In section 3 we explain that the forces applied on the interface can be split 
in different ways. Although the final result is independent of that splitting, it 
affects the actual FETI iteration history. We then present an efficient way to 
define such a splitting and show how it is related to the construction of the 
initial iterate of FETI in section 4. Numerical examples are reported in section 
5 to illustrate the effectiveness of the new initialization strategy. Finally, we 
present some conclusions. 



2 FETI basics 



2.1 The decomposed problem 



Let us consider a domain Vt subdivided into N s non-overlapping subdomains 
ftW and assume that we are solving a linear (or linearized) static equilibrium 
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problem on the domain. The discretized subdomain equilibrium is expressed 
by 



K (S) U (S) = f(s) + g 



(s) 



h...N„ 



(2) 



where K^ s \ and are the subdomain stiffness matrices, displacements 
and applied forces respectively, are the connecting forces on the interface 
between subdomains (thus zero on the internal degrees of freedom). For the 
sake of simplicity, we assume in the following that the meshes are matching 
(conforming) on the interface. 

The interface forces satisfy an interface equilibrium equation expressing that 
when assembled on the interface, the resultant is null (action-reaction): 



(3) 



s=l 



where is a Boolean assembly matrix. The interface connecting forces are 
such that the interface degrees of freedom are compatible, namely 



N s 



£B«u« = o 



(4) 



s=l 



This relation expresses that for any pair (u^ s \ u^) of degrees of freedom 
matching and the interface, — = 0. are thus signed Boolean 

matrices expressing the compatibility constraints on the interface. 

The equilibrium problem of domain Q is fully described by the local equilib- 
rium (2) and by the interface constraints (3, 4). In block diagonal notations, 
it can be summarized as 

( Ku= f+g 
L T g = 
Bu = 



(5) 



where K is the block diagonal matrix of the local operators and where 
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(6) 
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Note that in this description, one set of interface displacements and one set 
of interface forces are defined per subdomain. 



2.2 Solvers for decomposed problems 



Solving (5) can be done in several ways: 

• Considering (5) as a constrained equilibrium problem in terms of u and g 
leads to the three-field formulation of decomposed domains (see e.g. [?,?]). 

• One can choose to work with a displacement set u that satisfies a priori the 
interface compatibility (4). For that purpose we define a global set u g of 
degrees of freedom unique on the interface such that 

u (s) = L (s) u g or u = Lu g (7) 

where is the same assembly Boolean matrix as in (3) that extracts 
subdomain degrees of freedom from the global set. Stating that are 
obtained from a unique set is obviously equivalent to stating the interface 
compatibility (4) and (7) thus implies 

Bu = BLug = (8) 

for any global displacement u g . On the other hand, all compatible displace- 
ments can be written as in (7). Hence 

L = null(B) (9) 

In order to illustrate these concepts, we reproduce the example of [?] in 
Figure 3. Introducing (7) in (5) yields 

KLu a — f + q 

< TT n (10) 
L T g = 

V 

This set of equations is at the basis of the primal iterative solution tech- 
niques such as the Primal Schur Complement or the BDD methods [?,?]: 
iteration schemes are applied to find the displacements u g until the interface 
equilibrium L T g = L T (KLu g — f) — is satisfied. 

• One can choose in (5) a set of interface forces satisfying a priori the interface 
equilibrium L T g = while keeping redundant interface degrees of freedom 
in u. According to (9), such interface forces have the generic expression 

g W = -B^ T X or g = -B T \ (11) 

A are interface forces that act in opposite directions between any pair of 
matching degrees of freedom on the interface and are therefore in equilibrium 
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Fig. 3. Lagrange multipliers and interface compatibility 
(see Figure 3). Problem (5) becomes 



Ku + B T \ = f 
Bu = 



(12) 



Clearly, A are the Lagrange multipliers associated to the interface compat- 
ibility constraints. This form of the decomposed problem is the basis for 
the dual procedures such as FETI: iterative algorithms are applied to com- 
pute the interface forces A such that the displacements resulting from the 
subdomain equilibrium are compatible on the interface. 
If one chooses interface displacements that are unique on part of the in- 
terface while, on the remainder of the interface, equilibrated connecting 
forces are defined, one obtains hybrid primal/dual approaches such as the 
FETI-DP procedure [?]. 

If on the entire interface we use displacements and forces that satisfy a linear 
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combination of the interface equilibrium and compatibility (e.g. Robin type 
of boundary conditions), one obtains formulations typically used in wave 
propagation analysis such as described for instance by Helmholtz equations 

[?]■ 

• Finally if both the interface equilibrium and compatibility are enforced a 
priori, one obtains the fully assembled form 

L T KLu g = L T f (13) 



2.3 FETI: the dual iterative solver 



In the FETI method [?], the decomposed problem (12) is expressed in terms 
of interface forces A: using the subdomain equilibrium equations to eliminate 

„« = K« + ( /(•) - b« t a) - JlWaW (14) 

where is the inverse of or a generalized inverse if subdomain 

is floating when disconnected from its neighbors. In the latter case, are 
the associated rigid body modes, their amplitudes being determined such 
that the interface forces are in equilibrium with the applied forces f( s \ i.e. 
such that the subdomain equilibrium is well-posed: 

R (*) T ff(s) _ B (s) T x) = (15) 



Substituting (14) into the interface compatibility condition, and taking ac- 
count of (15), one obtains the dual interface problem 
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(16) 



where 
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F I = Y / B (s) K (s) B {s) =BK + B T 

s=l 

d=J2 B {s) K {s)+ f {s) = BKf 



s=l 



G I = 


bWrw ... b^rw 


= BR 








" J R( 1 ) T / {1) 
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a (Ns) 




R (N a rf(N s ) 



R T f 



Let use define the projection operator P such that GjP = 0: 

P(Q) = I-QG T (GjQGjy 1 Gj (17) 
The choice of the operator Q is discussed later. Introducing the splitting 



A = P(Q)A + A 
A = QG 7 (GjQG^e 



(18) 
(19) 



the dual interface problem (16) is equivalent to 
P(Q) T F / P(Q)A = P(Q) T (d - Fj\ ) 



(20) 



The FETI method consists in preconditioned conjugate gradient iterations on 
the dual interface problem (20) [?]. Applying Fj to an iterate is a naturally 
parallel operation. The projection steps however require solving a coarse global 
problem. The preconditioning operators and the possible choices for Q are 
summarized next. 



2.4 FETI preconditioners and coarse grid space 



The forces B T X exist only on the interface degrees of freedom. Using a sub- 
script b and j for interface boundary and internal degrees of freedom respec- 
tively, one can re- write the decomposed problem (12) by condensing out uf. 



Su b + BIX = ft 
B b u b = 



(21) 
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where B b is a submatrix of B associated to boundary displacements only and 



Tf(s) JV-(S) |V-(S) 1 |^(S) 



(22) 



0) R-W i^O)" 1 f 00 



(23) 



5 is the block diagonal matrix of the local operators statically condensed on 
the interface (also known as Schur complements). The dual interface problem 
(16) or (20) can thus be written in equivalent forms by replacing K, f and 
B by S, f b and B b respectively. 

The preconditioner in FETI for which the optimal conditioning number (1) 
holds is the Dirichlet preconditioner: 



~ T 

B b SB b 



(24) 



The operator B b is similar to B b but includes a scaling with respect to inter- 
face multiplicity or relative interface stiffness [?]. The scaling is implemented 
as a simple pre- and post-processing in the preconditioning step. Mathemati- 
cally, its general expression is [?,?] 



B b = (B b ABl) + B b A 



(25) 



where the + superscript denotes an inverse or a pseudo-inverse in case redun- 
dant compatibility constraints are present. It can be shown that the multi- 
plicity scaling procedure corresponds to A — J, whereas the stiffness scaling 
(also known as super-lumped scaling) corresponds to A = diag(.K W) )~ 1 . 

The operator Q in the coarse grid projector (17) defines the interface forces 
QGi associated to rigid mode displacements of subdomains. It can be chosen 
as Q — I, but for many engineering problems it should be taken as the 
preconditioner, namely 

Q = Ff D l (26) 
or at least as a degenerated form of the preconditioner such as 



Q = (B b diag(K 66 )- 1 B b T ) 



+ 



(27) 



Further discussion on preconditioning, scaling and on the choice of Q can be 
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found in [?,?,?,?]. 



3 Splitting forces on an interface 

The definition of the decomposed hybrid problem (12) is not unique: any 
interface force satisfying the interface equilibrium can be added to the system 
without changing the final solution u. Indeed, replacing (12) by 

J J " (28) 

Bu= 

V 

for any /x will yield the same solution u, the Lagrange multiplier being then 
such that A = X+fJ,. This can also be observed from the fact that (12) and (28) 
have the same assembled forces since L T f = L T f. Mechanically speaking, it 
means that adding forces on one side of the interface boundary and subtracting 
it on the other side yields the same solution in terms of displacement although 
it modifies the internal forces on the interface. 



3.1 Force splitting commonly used 

From the discussion above, it is clear that one can choose the splitting of 
interface forces on the interface boundaries 1 . Commonly, interface forces are 
split between the connecting subdomains proportionally to subdomain relative 
stiffness, in a way consistent with the scaling in the preconditioner. Mathe- 
matically speaking, this is equivalent to choosing 

/ = diag(K)L (L T diag(K)L) _1 (L T f) (29) 

where (-^ T /) are the applied forces given in an initially assembled problem or 
the assembled forces corresponding to a different force splitting. (L T &mg(K)L) 
is the assembled diagonal stiffness on the interface. 

Let us note that for any symmetric positive definite matrix A, the following 



1 In the same manner, it was noted in [?] that coefficients of additional constraints 
can also be arbitrarily split on the interface, which led to the construction of efficient 
preconditioners for problems with multipoint constraints. 
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relation holds: 2 

AL (i/ALy 1 L T + B T (BA- 1 B T ) + BA 1 = I (30) 
From (30), one deduces that the force splitting (29) is equivalent to 

f = f-B T (Bdiag(K)- 1 B T ) + J Bdiag(K)- 1 / (31) 

This last relation shows that the common splitting technique (29) corresponds 
to adding a particular set of equilibrated interface forces as described by (28). 



3.2 Splitting statically condensed forces 

In section 2.4, we indicated that the hybrid decomposed problem can be set 
in an equivalent form condensed on the interface (21). Following the same 
splitting procedure for the force as in (29), one would then construct the 
problem 

+ = * (32) 

B b u b = 

V 

where 

ft = dmg(K bb )L b (L T b dmgiK^L.y 1 (Lift) (33) 

and where (L b ft) are the applied forces statically condensed and assembled 
on the interface. Following a similar discussion as in the previous section, it is 
straightforward to show that such a splitting is equivalent to 

ft = ft - Bl (B b dmg(K bb )- l Bl) + B b diag^)- 1 /* (34) 

indicating again that the splitting corresponds to adding a particular set of 
equilibrated interface forces. 

Let us observe that, in the non-condensed format, (32) is identical to 



2 This relation expresses the complementarity of the primal and dual scaling and 
was already suggested in [?]. Its proof can be obtained by noting that '(30) and 
BA 1 (30) are trivially satisfied. Owing to the fact that A is symmetric positive 
definite and because the image of L corresponds to the nullspace of £?, the final 
result is obtained. 
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Ku + B X 



f-B T ( J Bdiag(K)- 1 J B T ) + Bdiag(K)- 1 



(35) 





ft 



When comparing it with (31), it is clear that splitting the condensed forces is 
not equivalent to splitting / as commonly done. Although the displacement 
field obtained is obviously the same, the Lagrange multipliers searched for 
during the FETI iterations will be different. The question thus arises: what 
splitting should be used and how does it affect the FETI iterations on the dual 
interface problem? Next section provides a theoretical analysis of the interest 
of the splitting while the section after provides related numerical assessments. 



4 An improved initial estimate of the Lagrange multipliers 



Let us consider again the hybrid decomposed problem (12). In order to con- 
struct an estimate for the interface Lagrange multipliers, let us assume that 
the internal degrees of freedom Ui satisfy the local equilibrium while u b on the 
interface boundary have a zero estimate. This is exactly the assumption un- 
derlying the initialization of the primal Schur complement iteration schemes. 
The sub domain equilibrium then writes 

B T b \ * f* b (36) 

which cannot be exactly satisfied (unless u b = corresponds to the solution). 
Hence, let us decompose the statically condensed force f b * into a component 
satisfying the interface equilibrium (i.e. belonging to the image of B%) and a 
remainder: 

Bl\ ~ f* = B%j + remainder (37) 

where 

1 =(B b DBl) + B b Df* b (38) 

for any non-singular and symmetric matrix D. The remainder is then orthog- 
onal to DB]j . If we choose as initial estimate 

Aoo = 7 (39) 

then the initial equilibrium residual -B^Aoo — f b = remainder is minimum in 
the iD-norm. Setting D to be diag(-K^ ) )~ 1 

Aoo = (B b dmg{K bb )- l Bl) + B b &mg(K hb )- l r b (40) 
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can be easily computed by a scaling of the interface condensed forces. Observ- 
ing that diag(-K"f,t) _1 is an approximation of S + , Ao minimizes a norm having 
the meaning of an energy. 

We thus propose to start the FETI iterations with an initial estimate obtained 
by rendering A o admissible, i.e. such that Gj\ = e. Using the notations 
introduced in (17), 

A = P(Q) A 00 + QG T (G%QGj) ' e (41) 

To be consistent with the choice D = diag(K" fe6 ) _1 , one should take Q as in 
(26) or (27). 



Remarks 

• If D = S + in (38), Aoo = Fj + d. Furthermore, one would have Q = F[ + 
and 

A = P(Q)F I + d + F I + G I (GjF I + G i y 1 e 

= Fr + (d-Gra) (42) 
with a = (GjF 7 + G 7 ) _1 (GjF^d - e) 

Hence the new choice of initial estimate would yield the exact solution. 

• If we choose to split the interface forces such as described in (33) (or its 
equivalent form 34), (40) yields A o = so that the initial iterate (41) would 
correspond to the standard FETI starting procedure. Hence it is equivalent 
to apply standard FETI iterations when splitting condensed forces on the 
interface as in (33) or to use any decomposed force vector f together with 
the starting procedure (4-1)- 

• The construction of A o in (40) has exactly the same mechanical interpreta- 
tion as the scaling of interface forces in the preconditioner (see [?]). The cost 
incurred by (33) or (41), corresponds to a preconditioning step, thus less 
than one half of a FETI iteration. Obviously, if the lumped preconditioner 
is applied, the standard splitting (29) should be considered. 



5 Numerical assessment 



In order to assess the performance of the new estimate for the Lagrange multi- 
pliers, we consider the solution to various problems by FETI and BDD. In all 
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cases, convergence is monitored through the evaluation of the global residual: 



\ K 9 u g- fg 



< e (43) 



where K g , u g and f g are global assembled stiffness matrix, displacement field 
and forces (see (13)). All results are obtained with the Dirichlet preconditioner 
equipped with the super-lumped scaling, i.e. A = diag(-K 6 5)~ 1 . The stopping 
criterion e is set to 10~ 6 . The FETI method is tested for its different projectors: 
we note P(Ff^) the Dirichlet projector, P(W) the superlumped projector 
corresponding to (27) and P(I) the identity projector. 



5.0.0.1 Cube with checkerboard heterogeneity: Let us first assess 
the strategies described above on the problem described in the introduction 
(see fig. 1). Figure 4 presents the convergence of the primal residual through 
the conjugate gradient iterations for the BDD and for the FETI approaches 
with standard force splitting. The primal approach clearly exhibit better re- 
sults. Also, due to the strong heterogeneity in the structure and along the 
interfaces, the projector P(I) yields very poor convergence and is thus not 
suitable. It is observed that, although the Dirichlet projector yields a better 
convergence rate, the superlumped projector P(W) has a significantly lower 
initial residual and hence reaches convergence faster. The initialization asso- 
ciated to P(W) leads to a starting residual 10 3 times smaller, however its 
behaviour during the solution process is less regular. 
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Fig. 4. Cube with checkerboard heterogeneity (fig. 1): convergence of the primal 
residual for FETI with standard force splitting 



Figure 5 presents the convergence history for the BDD method and for the 
FETI method equipped with the Dirichlet projector and with the new initial- 
ization. For comparison, the convergence obtained with the standard splitting 
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is shown once more. Note that the new initialization not only provides a better 
initial estimate (lower initial residual) but also leads to similar convergence 
rates as for the standard splitting. Therefore the convergence history of FETI 
becomes very similar to the convergence of the primal BDD approach. The 
total number of iterations is about the same as for the primal BDD, although 
the convergence of FETI is less monotonic. 
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Fig. 5. Cube with checkerboard heterogeneity (fig. 1): convergence of the primal 
residual for FETI with new initialization 



Approach 


Number of 
iterations 


Initial Residual 
(Log) 


BDD 


19 


0.659 


FETI 
Dirichlet 


No Splitting 


28 


4.428 


Classical Splitting 


28 


4.377 


New Initialization 


18 


0.359 


FETI 
Dirichlet 
P(W) 


No Splitting 


21 


0.873 


Classical Splitting 


21 


0.872 


New Initialization 


20 


0.868 


FETI 
Dirichlet 
P(I) 


No Splitting 


74 


5.029 


Classical Splitting 


74 


5.026 


New Initialization 


73 


5.016 



Table 1 

Performance results for the cube of fig. 1 

Table 1 summarizes the performance results of the various available strate- 
gies. To investigate the efficiency of the new initialization (or force splitting) 
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procedure, we compare the number of iterations to achieve convergence and 
the norm of the initial primal residual. The new initialization yields signifi- 
cant improvements for the Dirichlet projector (35% less iterations), but only 
slightly affects the convergence when other projectors are used. 



5.0.0.2 Cube and slanted cube with heterogeneous layers: We now 

assess the new initialization strategy on two other configurations of the cube 
in order to evaluate the influence of the geometry and of the repartition of 
heterogeneities. The structures depicted in Figure 6 and 7 are similar to fig. 
1 but with different material distribution. Also, for the problem described in 
Figure 7, the cube has been slanted by 60 degrees. In table 2 we report the 
number of iterations when using the BDD solver and when applying FETI 
with the standard and the new initialization. 
















El 


E2 







Fig. 6. Cube with heterogeneous Fig. 7. Slanted cube with hetero- 
layers geneous layers 

For the straight cube of fig. 6, the use of the novel initialization leads to only 
small improvements. Observing that the FETI solver with Dirichlet projector 
and standard splitting converges nearly as fast as the BDD, it is clear that 
the new splitting strategy can not have a significant effect. 

For the slanted cube, due to the geometric distortion of the substructures and 
of the mesh, the BDD convergence is better than when the FETI with the 
standard splitting is applied. When the new initialization strategy is used, the 
convergence history of FETI is again very similar to the convergence of the 
primal BDD. 



5.0.0.3 Nonlinear flexion of a composite beam As a last example, we 
analyze a non-linear problem solved through a sequence of linearized systems. 
The structure (fig. 8) is a slender beam of aspect ratio 9 with square cross- 
section. It is made of longitudinal strips of metal and rubber. The loading 
corresponds to an imposed pressure on one side of the beam. Due to the pres- 
ence of elastomer parts, the structure undergoes large deformations. We chose 
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Approach 


Cube, fig. 6 


Slanted cube, fig. 7 






Num. of it. 


Num. of it. 


BDD 


19 


73 




iNo spurting 


20 


85 


Dirichlet 


Classical Splitting 


21 


85 




incw iniiiaiiZHiioii 


19 


73 


FETI 


No Splitting 


22 


88 


Dirichlet 


Classical Splitting 


22 


88 


P(W) 


New Initialization 


22 


85 


FETI 


No Splitting 


92 


153 


Dirichlet 


Classical Splitting 


92 


154 


P(I) 


New Initialization 


90 


159 



Table 2 

Performance results on problem fig. 6 and 7 

a Kirchoff Saint- Venant model for the metal, the characteristic coefficients 
of which are its Young modulus E = 20000 MPa and Poisson's coefficient 
v = 0.3. A NeoHookian model is assumed for the rubber characterized by a 
shear modulus G = 2.0 MPa and a compressibility modulus K = 2000 MPa. 
The structure is decomposed into 27 monomaterial paralellepipedic subdo- 
mains and each substructure is meshed in 2 x 2 x 18 cubic elements. In order 
to handle the quasi-incompressibility of the elastomer, a mixed finite element 
formulation is considered where the pressure field and displacements are dis- 
cretized independently. We choose a Q 2 — Pi hexaedral element with 27 dis- 
placement nodes and 4 internal pressure nodes. Details on this formulation 
and on the practicalities for applying it properly in simulation can be found 
in[?]. 

The complete model has 55300 degrees of freedom of which 16400 belong to 
the interface. Due to the behaviour of the elastomer, the problem is highly 
nonlinear. Newton-Raphson iterations are performed where the tangent ma- 
trix is updated at every step. The pressure loading is 5 bars. The stopping 
criterion is set to 10~ 3 for the relative primal residual of FETI when solv- 
ing the linearized systems. For the Newton-Raphson iterations, the tolerance 
for the relative residual of the nonlinear equations is set to 10~ 4 , so that 5 
Newton-Raphson iterations and thus 5 linear solves must be performed. 

Figure 9 indicates the number of iterations required when solving the linearized 
systems by the classical FETI method, by FETI with improved initialization 
and by the primal BDD. The Dirichlet and diagonal projectors are applied, 
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Fig. 8. 3 x 3 x 3 substructures composite beam 



and the Dirichlet preconditioner with stiffness scaling is used. The significant 
increase of the number of iterations between the first and the following lin- 
earized systems can be explained by the loss of positivity of the tangent matrix 
which is mostly due to incompressibility. Indeed, the conjugate gradient algo- 
rithm (with full re-orthoganilization) applied on the interface problem in the 
FETI and in the BDD methods remains applicable for non positive matrices 
but its convergence is significantly slowed down [?]. 3 As observed from figure 
9, the new initialization enables FETI to achieve performance results which 
are very similar to the BDD method. Comparing the total number of itera- 
tions, the classical FETI approach requires 10% more iterations than BDD 
while the FETI method with new initialization requires slightly less itera- 
tions than the BDD. From figure 9 we also observe that the new initialization 
technique improves the performance of FETI both for the Dirichlet and the 
diagonal projector. For this particular structure with regular geometry, the 
cost effective diagonal projector yields a convergence rate very similar to the 
convergence rate obtained with the more computationally intensive Dirichlet 
projector, except for the very first linearized system solve. 

Our numerical experiments indicate that for a large class of nonlinear problems 
such as the one depicted here, the new initialization leads to a small but 
non-negligible gain in terms of number of iterations and CPU time. Another 
important beneficial effect of the new starting procedure for FETI comes from 
the fact that, since the initial residual of the iterations on the interface problem 
are several orders of magnitude lower with the new initialization, stagnation 
of the residual of the FETI iterations which often happens when dealing with 
higher nonlinearity (higher loading) is significantly delayed, so that in practice 
restarting of the iteration can be avoided. 



3 Often, non-positivity is due to the mutation of former null-modes (rotations) to 
negative modes, it can be handled by the introduction of these modes as constraints 
in Krylov-augmented algorithms [?]. In our case, non-positivity is mostly due to 
the behaviour of the rubber and the strategy described above is non-relevant. An 
efficient strategy based on the approximation of negative modes can be found in [?]. 
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Fig. 9. Performance results for the flexion of the composite beam 



6 Conclusions 

For some particular structural problems such as those exhibiting strong het- 
erogeneities and geometric distortion, the Finite Element Tearing and Inter- 
connecting (FETI) solver can yield poor convergence compared to the concep- 
tually similar Balanced Domain Decomposition (BDD) solver. In those cases, 
the bad performance of FETI can be traced back to high initial residual in the 
iterations on the interface problem. The initial residual is strongly related to 
the way the applied forces are split on the subdomain interface boundaries. 

In this paper, we propose a novel strategy to split the applied forces between 
the subdomains. We propose to split the statically condensed interface force 
according to interface diagonal stiffness. This leads to building a more efficient 
initial estimate for the interface connecting forces. The new initialization for 
the FETI iterations mainly involves computing statically condensed forces on 
the interface and can thus be performed at a computational cost equivalent 
to less than half the cost of a full FETI iteration. 

The numerical examples described in this paper indicate that for the problems 
where the primal BDD method outperforms FETI due to unexpected high 
initial residual, the new initialization strategy builds a better starting estimate 
of the interface forces and, in turn, to an initial residual similar to the BDD 
residual. The FETI method then converges in a manner very similar to the 
BDD method. 

The novel starting strategy never deteriorates the FETI convergence and leads 
to significant improvements in some pathological cases. Therefore we suggest 
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to use the presented initialization as default in FETI solvers. 

With the proposed initialization for FETI, the FETI method and the BDD 
solver lead to similar convergence and computational costs for complex prob- 
lem where the Dirichlet projector is required. For problems where the sim- 
plified FETI preconditioners and projectors can be used without significantly 
deteriorating the convergence of the interface iterations, FETI is often found 
to be more efficient in terms of overall computing cost. 
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an efficient way to define such a splitting and show how it is related to 
the construction of the initial iterate of FETI in section 4. Numerical 
examples are reported in section 5 to illustrate the effectiveness of the 
new initialization strategy. Finally, we present some conclusions. 



2 FETI basics 

2.1 The decomposed problem 

Let us consider a domain O subdivided into N s non-overlapping subdo- 
mams Q (s) and assume that we are solving a linear static equilibrium 
problem on the domain. The discretized subdomain equilibrium is ex- 
pressed by 

K (s) u (s) = f (s) +g (s) s = l,...N s (2) 

where K (s \ and are the subdomain stiffness matrices, displace- 
ments and applied forces respectively. g^ s ' are the connecting forces on 
the interface between subdomains. For the sake of simplicity, we assume in 
the following that the meshes are matching (conforming) on the interface. 

The interface forces satisfy an interface equilibrium equation express- 
ing that when assembled on the interface, the resultant is null (action- 
reaction): 

^L (s) V s) = (3) 

s = l 

where L*-"' is a Boolean assembly matrix. The interface connecting forces 
are such that the interface degrees of freedom are compatible, namely 

^B w « (s) =0 (4) 

s = l 

This relation expresses that for any pair (u (a \ u^ r ') of degrees of freedom 
matching and the interface, u (s) -u (r) = 0. B^ s ' are thus signed Boolean 
matrices expressing the compatibility constraints on the interface. 

The equilibrium problem of domain O is fully described by the local 
equilibrium (2) and by the interface constraints (3, 4). In block diagonal 
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notations, it can be summarized as 




f + 9 







(5) 



where K is the block diagonal matrix of the local operators K ^ and 
where 
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(6) 



Note that in this description, one set of interface displacements and one 
set of interface forces are defined per subdomain. 



2.2 Solvers for decomposed problems 

Solving (5) can be done in several ways: 

• Considering (5) as a constrained equilibrium problem in terms of u 
and g leads to the three-field formulation of decomposed domains 
(see e.g. [PJF97,RFTM99]). 

• One can choose to work with a displacement set u that satisfies a 
priori the interface compatibility (4). For that purpose we define a 
global set u g of degrees of freedom unique on the interface such that 

= L^Ug or u = Lu g (7) 

where is the same assembly Boolean matrix as in (3) that relates 
subdomain degrees of freedom to the global set. Stating that 
are obtained from a unique set is obviously equivalent to stating the 
interface compatibility (4) and (7) thus implies 

Bu = BLug = (8) 

for any global displacement u g . On the other hand, all compatible 
displacements can be written as in (7). Hence 

L = null(B) (9) 

In order to illustrate these concepts, we reproduce the example of 
[Rix02a] in Figure 3. Introducing (7) in (5) yields 

f KLu g = f + g 

I L T g = (10) 

This set of equations is at the basis of the primal iterative solu- 
tion techniques such as the Primal Schur Complement or the BDD 
methods [TRV91, Man93]: iteration schemes are applied to find the 
displacements u g until the interface equilibrium L T g = L T (K Lu g — 
f) = is satisfied. 
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Figure 3: Lagrange multipliers and interface compatibility (from [Rix02a]) 
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• One can choose in (5) a set of interface forces satisfying a priori the 
interface equilibrium L T g = while keeping redundant interface 
degrees of freedom in u. According to (9), such interface forces have 
the generic expression 

g W = - B (S)T X or g = -B T \ (11) 

A are interface forces that act in opposite directions between any pair 
of matching degrees of freedom on the interface and are therefore in 
equilibrium (see Figure 3). Problem (5) becomes 

{ K " +B 2 : I 

Clearly, A are the Lagrange multipliers associated to the interface 
compatibility constraints. This form of the decomposed problem is 
the basis for the dual procedures such as FETI: iterative algorithms 
are applied to compute the interface forces A such that the displace- 
ments resulting from the subdomain equilibrium are compatible on 
the interface. 

• If one chooses interface displacements that are unique on part of 
the interface while, on the remainder of the interface, equilibrated 
connecting forces are defined, one obtains hybrid primal/dual ap- 
proaches such as the FETI-DP procedure [FLL+01]. 

• If on the entire interface we use displacements and forces that satisfy 
a linear combination of the interface equilibrium and compatibility 
(e.g. Robin type of boundary conditions), one obtains formulations 
typically used in wave propagation analysis such as described for 
instance by Helmholtz equations [dLBFM + 98]. 

• Finally if both the interface equilibrium and compatibility are en- 
forced a priori, one obtains the fully assembled form 

L T KLu g = L T f (13) 

Let us note that in the case where interfaces are non-matching, the 
relations above remain valid, but the interface assembly and constraint 
matrices (L and B) are no longer Boolean (see for instance [BMP89]). 



2.3 FETI: the dual iterative solver 

In the FETI method [FR94], the decomposed problem (12) is expressed 
in terms of interface forces A: using the subdomain equilibrium equations 
to eliminate u (s \ 

u (s) = K W+ ( /W _ B (s)T^ _ R(s)a( s) (M) 

where K is the inverse of or a generalized inverse if subdomain 
O^-* is floating when disconnected from its neighbors. In the latter case, 
are the associated rigid body modes, their amplitudes being de- 
termined such that the interface forces are in equilibrium with the applied 
forces f^, i.e. such that the subdomain equilibrium is well-posed: 

R (s) T ^f(s) _ B (s) T X ^ = (15) 
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